Results 1 - 20 of 108
1.
J Proteome Res ; 2024 Apr 23.
Article in English | MEDLINE | ID: mdl-38652578

ABSTRACT

Searching for tandem mass spectrometry proteomics data against a database is a well-established method for assigning peptide sequences to observed spectra but typically cannot identify peptides harboring unexpected post-translational modifications (PTMs). Open modification searching aims to address this problem by allowing a spectrum to match a peptide even if the spectrum's precursor mass differs from the peptide mass. However, expanding the search space in this way can lead to a loss of statistical power to detect peptides. We therefore developed a method, called CONGA (combining open and narrow searches with group-wise analysis), that takes into account results from both types of searches (a traditional "narrow window" search and an open modification search) while carrying out rigorous false discovery rate control. The result is an algorithm that provides the best of both worlds: the ability to detect unexpected PTMs without a concomitant loss of power to detect unmodified peptides.

2.
bioRxiv ; 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38617345

ABSTRACT

Membrane-bound particles in plasma are composed of exosomes, microvesicles, and apoptotic bodies and represent ~1-2% of the total protein composition. Proteomic interrogation of this subset of plasma proteins augments the representation of tissue-specific proteins, representing a "liquid biopsy," while enabling the detection of proteins that would otherwise be beyond the dynamic range of liquid chromatography-tandem mass spectrometry of unfractionated plasma. We have developed an enrichment strategy (Mag-Net) using hyper-porous strong-anion exchange magnetic microparticles to sieve membrane-bound particles from plasma. The Mag-Net method is robust, reproducible, inexpensive, and requires <100 µL plasma input. Coupled to a quantitative data-independent mass spectrometry analytical strategy, we demonstrate that we can collect results for >37,000 peptides from >4,000 plasma proteins with high precision. Using this analytical pipeline on a small cohort of patients with neurodegenerative disease and healthy age-matched controls, we discovered 204 proteins that differentiate (q-value < 0.05) patients with Alzheimer's disease dementia (ADD) from those without ADD. Our method also discovered 310 proteins that were different between Parkinson's disease and those with either ADD or healthy cognitively normal individuals. Using machine learning we were able to distinguish between ADD and not ADD with a mean ROC AUC = 0.98 ± 0.06.

3.
Nat Commun ; 15(1): 1027, 2024 Feb 03.
Article in English | MEDLINE | ID: mdl-38310092

ABSTRACT

Fluorescent in situ hybridization (FISH) is a powerful method for the targeted visualization of nucleic acids in their native contexts. Recent technological advances have leveraged computationally designed oligonucleotide (oligo) probes to interrogate > 100 distinct targets in the same sample, pushing the boundaries of FISH-based assays. However, even in the most highly multiplexed experiments, repetitive DNA regions are typically not included as targets, as the computational design of specific probes against such regions presents significant technical challenges. Consequently, many open questions remain about the organization and function of highly repetitive sequences. Here, we introduce Tigerfish, a software tool for the genome-scale design of oligo probes against repetitive DNA intervals. We showcase Tigerfish by designing a panel of 24 interval-specific repeat probes specific to each of the 24 human chromosomes and imaging this panel on metaphase spreads and in interphase nuclei. Tigerfish extends the powerful toolkit of oligo-based FISH to highly repetitive DNA.


Subjects
DNA , Nucleic Acid Repetitive Sequences , Humans , Fluorescence In Situ Hybridization/methods , DNA/genetics , Nucleic Acid Repetitive Sequences/genetics , Oligonucleotide Probes/genetics , DNA Probes/genetics , Oligonucleotides/genetics
4.
J Proteome Res ; 22(11): 3427-3438, 2023 Nov 03.
Article in English | MEDLINE | ID: mdl-37861703

ABSTRACT

Quantitative measurements produced by tandem mass spectrometry proteomics experiments typically contain a large proportion of missing values. Missing values hinder reproducibility, reduce statistical power, and make it difficult to compare across samples or experiments. Although many methods exist for imputing missing values, in practice, the most commonly used methods are among the worst performing. Furthermore, previous benchmarking studies have focused on relatively simple measurements of error such as the mean-squared error between imputed and held-out values. Here we evaluate the performance of commonly used imputation methods using three practical, "downstream-centric" criteria. These criteria measure the ability to identify differentially expressed peptides, generate new quantitative peptides, and improve the peptide lower limit of quantification. Our evaluation comprises several experiment types and acquisition strategies, including data-dependent and data-independent acquisition. We find that imputation does not necessarily improve the ability to identify differentially expressed peptides but that it can identify new quantitative peptides and improve the peptide lower limit of quantification. We find that MissForest is generally the best performing method per our downstream-centric criteria. We also argue that existing imputation methods do not properly account for the variance of peptide quantifications and highlight the need for methods that do.
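
The "mean-squared error between imputed and held-out values" that the abstract describes as the usual benchmarking criterion can be illustrated with a minimal sketch. The function names `mean_impute` and `held_out_mse` are hypothetical, and column-mean imputation is only a stand-in here; it is not MissForest or any other method evaluated in the paper.

```python
def mean_impute(column):
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [v for v in column if v is not None]
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in column]


def held_out_mse(column, held_out_idx):
    """Hide the values at held_out_idx, impute them, and score the imputation.

    Assumes every index in held_out_idx points at an observed value,
    so the true value is available for comparison.
    """
    masked = [None if i in held_out_idx else v for i, v in enumerate(column)]
    imputed = mean_impute(masked)
    errors = [(imputed[i] - column[i]) ** 2 for i in held_out_idx]
    return sum(errors) / len(errors)


# Toy peptide intensities; the last value is held out and then imputed.
intensities = [1.0, 2.0, 3.0, 6.0]
mse = held_out_mse(intensities, {3})
```

A score of this kind measures only how close imputed values are to hidden ones, which is why the abstract argues for additional downstream-centric criteria.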


Subjects
Algorithms , Proteomics , Proteomics/methods , Reproducibility of Results , Tandem Mass Spectrometry , Peptides/analysis
5.
Mol Cell ; 83(15): 2624-2640, 2023 Aug 03.
Article in English | MEDLINE | ID: mdl-37419111

ABSTRACT

The four-dimensional nucleome (4DN) consortium studies the architecture of the genome and the nucleus in space and time. We summarize progress by the consortium and highlight the development of technologies for (1) mapping genome folding and identifying roles of nuclear components and bodies, proteins, and RNA, (2) characterizing nuclear organization with time or single-cell resolution, and (3) imaging of nuclear organization. With these tools, the consortium has provided over 2,000 public datasets. Integrative computational models based on these data are starting to reveal connections between genome structure and function. We then present a forward-looking perspective and outline current aims to (1) delineate dynamics of nuclear architecture at different timescales, from minutes to weeks as cells differentiate, in populations and in single cells, (2) characterize cis-determinants and trans-modulators of genome organization, (3) test functional consequences of changes in cis- and trans-regulators, and (4) develop predictive models of genome structure and function.


Subjects
Cell Nucleus , Genome , Genome/genetics , Cell Nucleus/genetics , Cell Nucleus/metabolism , Chromatin/metabolism
6.
Genome Biol ; 24(1): 134, 2023 Jun 06.
Article in English | MEDLINE | ID: mdl-37280678

ABSTRACT

Recent deep learning models that predict the Hi-C contact map from DNA sequence achieve promising accuracy but cannot generalize to new cell types or even capture differences among training cell types. We propose Epiphany, a neural network to predict cell-type-specific Hi-C contact maps from widely available epigenomic tracks. Epiphany uses bidirectional long short-term memory layers to capture long-range dependencies and optionally a generative adversarial network architecture to encourage contact map realism. Epiphany shows excellent generalization to held-out chromosomes within and across cell types, yields accurate TAD and interaction calls, and predicts structural changes caused by perturbations of epigenomic signals.


Subjects
Chromosomes , Epigenomics , Computer Neural Networks , Chromatin
7.
J Proteome Res ; 22(7): 2172-2178, 2023 Jul 07.
Article in English | MEDLINE | ID: mdl-37261867

ABSTRACT

Controlling the false discovery rate (FDR) among discoveries from a tandem mass spectrometry proteomics experiment using target decoy competition (TDC) controls only the proportion of false discoveries in an average sense. Thus, for any particular analysis, even with a valid FDR control procedure, the proportion of false discoveries (the FDP) may be higher than the specified FDR threshold. We demonstrate this phenomenon using real data and describe two recently developed methods that help bridge the gap between controlling the expected or average rate of false discoveries and the empirical rate (FDP). The FDP Stepdown method controls the FDP at any desired confidence level, and the TDC Uniform Band provides a confidence, or upper prediction bound, on the FDP in TDC's list of discoveries.
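
As a rough illustration of the target decoy competition step the abstract refers to, the sketch below estimates the FDR at each score cutoff from decoy counts and returns the most permissive cutoff that stays under a chosen threshold. It assumes a commonly used estimate, (decoys + 1) / targets, and a hypothetical function name `tdc_cutoff`; it does not implement the FDP Stepdown or TDC Uniform Band methods described in the paper.

```python
def tdc_cutoff(scores, is_decoy, alpha):
    """Return (score_cutoff, n_target_discoveries) or None if nothing passes.

    scores   : match scores, higher is better
    is_decoy : parallel list of booleans, True for decoy wins
    alpha    : FDR threshold, e.g. 0.01
    """
    ranked = sorted(zip(scores, is_decoy), key=lambda pair: pair[0], reverse=True)
    targets = 0
    decoys = 0
    best = None
    for score, decoy in ranked:
        if decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among targets scoring at or above this cutoff.
        estimated_fdr = (decoys + 1) / max(targets, 1)
        if estimated_fdr <= alpha:
            best = (score, targets)
    return best


# Ten high-scoring targets, then a decoy, a target, and another decoy.
scores = [20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8]
is_decoy = [False] * 10 + [True, False, True]
result = tdc_cutoff(scores, is_decoy, alpha=0.15)
```

The estimate bounds the false discovery proportion only on average over repeated experiments, which is the gap between FDR and FDP that the abstract highlights.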


Subjects
Algorithms , Proteomics , Protein Databases , Proteomics/methods , Tandem Mass Spectrometry
8.
bioRxiv ; 2023 May 04.
Article in English | MEDLINE | ID: mdl-37205597

ABSTRACT

Background: The number and escape levels of genes that escape X chromosome inactivation (XCI) in female somatic cells vary among tissues and cell types, potentially contributing to specific sex differences. Here we investigate the role of CTCF, a master chromatin conformation regulator, in regulating escape from XCI. CTCF binding profiles and epigenetic features were systematically examined at constitutive and facultative escape genes using mouse allelic systems to distinguish the inactive X (Xi) and active X (Xa) chromosomes. Results: We found that escape genes are located inside domains flanked by convergent arrays of CTCF binding sites, consistent with the formation of loops. In addition, strong and divergent CTCF binding sites often located at the boundaries between escape genes and adjacent neighbors subject to XCI would help insulate domains. Facultative escapees show clear differences in CTCF binding dependent on their XCI status in specific cell types/tissues. Concordantly, deletion but not inversion of a CTCF binding site at the boundary between the facultative escape gene Car5b and its silent neighbor Siah1b resulted in loss of Car5b escape. Reduced CTCF binding and enrichment of a repressive mark over Car5b in cells with a boundary deletion indicated loss of looping and insulation. In mutant lines in which either the Xi-specific compact structure or its H3K27me3 enrichment was disrupted, escape genes showed an increase in gene expression and associated active marks, supporting the roles of the 3D Xi structure and heterochromatic marks in constraining levels of escape. Conclusion: Our findings indicate that escape from XCI is modulated both by looping and insulation of chromatin via convergent arrays of CTCF binding sites and by compaction and epigenetic features of the surrounding heterochromatin.

9.
bioRxiv ; 2023 Mar 07.
Article in English | MEDLINE | ID: mdl-36945528

ABSTRACT

Fluorescent in situ hybridization (FISH) is a powerful method for the targeted visualization of nucleic acids in their native contexts. Recent technological advances have leveraged computationally designed oligonucleotide (oligo) probes to interrogate >100 distinct targets in the same sample, pushing the boundaries of FISH-based assays. However, even in the most highly multiplexed experiments, repetitive DNA regions are typically not included as targets, as the computational design of specific probes against such regions presents significant technical challenges. Consequently, many open questions remain about the organization and function of highly repetitive sequences. Here, we introduce Tigerfish, a software tool for the genome-scale design of oligo probes against repetitive DNA intervals. We showcase Tigerfish by designing a panel of 24 interval-specific repeat probes specific to each of the 24 human chromosomes and imaging this panel on metaphase spreads and in interphase nuclei. Tigerfish extends the powerful toolkit of oligo-based FISH to highly repetitive DNA.

10.
Bioinformatics ; 39(1), 2023 Jan 01.
Article in English | MEDLINE | ID: mdl-36594573

ABSTRACT

MOTIVATION: We address the challenge of inferring a consensus 3D model of genome architecture from Hi-C data. Existing approaches most often rely on a two-step algorithm: first, convert the contact counts into distances, then optimize an objective function akin to multidimensional scaling (MDS) to infer a 3D model. Other approaches use a maximum likelihood approach, modeling the contact counts between two loci as a Poisson random variable whose intensity is a decreasing function of the distance between them. However, a Poisson model of contact counts implies that the variance of the data is equal to the mean, a relationship that is often too restrictive to properly model count data. RESULTS: We first confirm the presence of overdispersion in several real Hi-C datasets, and we show that the overdispersion arises even in simulated datasets. We then propose a new model, called Pastis-NB, where we replace the Poisson model of contact counts by a negative binomial one, which is parametrized by a mean and a separate dispersion parameter. The dispersion parameter allows the variance to be adjusted independently from the mean, thus better modeling overdispersed data. We compare the results of Pastis-NB to those of several previously published algorithms, both MDS-based and statistical methods. We show that the negative binomial inference yields more accurate structures on simulated data, and more robust structures than other models across real Hi-C replicates and across different resolutions. AVAILABILITY AND IMPLEMENTATION: A Python implementation of Pastis-NB is available at https://github.com/hiclib/pastis under the BSD license. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
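
The contrast the abstract draws between the Poisson and negative binomial models can be made concrete with a small sketch. It assumes one common negative binomial parametrization, variance = mean + dispersion × mean², which may differ from the exact form used in Pastis-NB; the function names and the toy counts are hypothetical.

```python
import statistics


def poisson_variance(mean):
    """Under a Poisson model the variance is forced to equal the mean."""
    return mean


def negative_binomial_variance(mean, dispersion):
    """Assumed parametrization: variance grows quadratically with the mean."""
    return mean + dispersion * mean ** 2


def estimate_dispersion(counts):
    """Method-of-moments estimate: solve var = mean + d * mean**2 for d."""
    mean = statistics.mean(counts)
    var = statistics.variance(counts)  # sample variance
    return (var - mean) / mean ** 2


# Toy contact counts whose variance (34) far exceeds their mean (6).
counts = [0, 2, 4, 10, 14]
dispersion = estimate_dispersion(counts)
```

A dispersion estimate well above zero, as with these toy counts, is the overdispersion signal that a Poisson model cannot represent because it has no separate variance parameter.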


Subjects
Algorithms , Genome , Likelihood Functions
11.
Stem Cell Reports ; 18(1): 159-174, 2023 Jan 10.
Article in English | MEDLINE | ID: mdl-36493778

ABSTRACT

Vascular endothelial cells are a mesoderm-derived lineage with many essential functions, including angiogenesis and coagulation. The gene-regulatory mechanisms underpinning endothelial specialization are largely unknown, as are the roles of chromatin organization in regulating endothelial cell transcription. To investigate the relationships between chromatin organization and gene expression, we induced endothelial cell differentiation from human pluripotent stem cells and performed Hi-C and RNA-sequencing assays at specific time points. Long-range intrachromosomal contacts increase over the course of differentiation, accompanied by widespread heteroeuchromatic compartment transitions that are tightly associated with transcription. Dynamic topologically associating domain boundaries strengthen and converge on an endothelial cell state, and function to regulate gene expression. Chromatin pairwise point interactions (DNA loops) increase in frequency during differentiation and are linked to the expression of genes essential to vascular biology. Chromatin dynamics guide transcription in endothelial cell development and promote the divergence of endothelial cells from cardiomyocytes.


Subjects
Chromatin , Endothelial Cells , Humans , Cell Differentiation/genetics , Gene Expression Regulation
12.
Am J Clin Pathol ; 157(5): 748-757, 2022 May 04.
Article in English | MEDLINE | ID: mdl-35512256

ABSTRACT

OBJECTIVES: Standard implementations of amyloid typing by liquid chromatography-tandem mass spectrometry use capabilities unavailable to most clinical laboratories. To improve accessibility of this testing, we explored easier approaches to tissue sampling and data processing. METHODS: We validated a typing method using manual sampling in place of laser microdissection, pairing the technique with a semiquantitative measure of sampling adequacy. In addition, we created an open-source data processing workflow (Crux Pipeline) for clinical users. RESULTS: Cases of amyloidosis spanning the major types were distinguishable with 100% specificity using measurements of individual amyloidogenic proteins or in combination with the ratio of λ and κ constant regions. Crux Pipeline allowed for rapid, batched data processing, integrating the steps of peptide identification, statistical confidence estimation, and label-free protein quantification. CONCLUSIONS: Accurate mass spectrometry-based amyloid typing is possible without laser microdissection. To facilitate entry into solid tissue proteomics, newcomers can leverage manual sampling approaches in combination with Crux Pipeline and related tools.


Subjects
Amyloidosis , Tandem Mass Spectrometry , Amyloid/analysis , Amyloidogenic Proteins , Amyloidosis/diagnosis , Humans , Microdissection , Tandem Mass Spectrometry/methods
13.
J Proteome Res ; 21(6): 1382-1391, 2022 Jun 03.
Article in English | MEDLINE | ID: mdl-35549345

ABSTRACT

Advances in library-based methods for peptide detection from data-independent acquisition (DIA) mass spectrometry have made it possible to detect and quantify tens of thousands of peptides in a single mass spectrometry run. However, many of these methods rely on a comprehensive, high-quality spectral library containing information about the expected retention time and fragmentation patterns of peptides in the sample. Empirical spectral libraries are often generated through data-dependent acquisition and may suffer from biases as a result. Spectral libraries can be generated in silico, but these models are not trained to handle all possible post-translational modifications. Here, we propose a false discovery rate-controlled spectrum-centric search workflow to generate spectral libraries directly from gas-phase fractionated DIA tandem mass spectrometry data. We demonstrate that this strategy is able to detect phosphorylated peptides and can be used to generate a spectral library for accurate peptide detection and quantitation in wide-window DIA data. We compare the results of this search workflow to other library-free approaches and demonstrate that our search is competitive in terms of accuracy and sensitivity. These results demonstrate that the proposed workflow has the capacity to generate spectral libraries while avoiding the limitations of other methods.


Subjects
Peptides , Tandem Mass Spectrometry , Peptide Library , Peptides/analysis , Post-Translational Protein Processing , Proteome/analysis , Tandem Mass Spectrometry/methods , Workflow
15.
Nat Rev Genet ; 23(3): 169-181, 2022 Mar.
Article in English | MEDLINE | ID: mdl-34837041

ABSTRACT

The scale of genetic, epigenomic, transcriptomic, cheminformatic and proteomic data available today, coupled with easy-to-use machine learning (ML) toolkits, has propelled the application of supervised learning in genomics research. However, the assumptions behind the statistical models and performance evaluations in ML software frequently are not met in biological systems. In this Review, we illustrate the impact of several common pitfalls encountered when applying supervised ML in genomics. We explore how the structure of genomics data can bias performance evaluations and predictions. To address the challenges associated with applying cutting-edge ML methods to genomics, we describe solutions and appropriate use cases where ML modelling shows great potential.


Subjects
Genomics/methods , Machine Learning , Animals , Genomics/standards , Genomics/trends , Humans , Machine Learning/standards , Statistical Models , Software
16.
Elife ; 10, 2021 Nov 04.
Article in English | MEDLINE | ID: mdl-34734806

ABSTRACT

A longstanding hypothesis is that chromatin fiber folding mediated by interactions between nearby nucleosomes represses transcription. However, it has been difficult to determine the relationship between local chromatin fiber compaction and transcription in cells. Further, global changes in fiber diameters have not been observed, even between interphase and mitotic chromosomes. We show that an increase in the range of local inter-nucleosomal contacts in quiescent yeast drives the compaction of chromatin fibers genome-wide. Unlike actively dividing cells, inter-nucleosomal interactions in quiescent cells require a basic patch in the histone H4 tail. This quiescence-specific fiber folding globally represses transcription and inhibits chromatin loop extrusion by condensin. These results reveal that global changes in chromatin fiber compaction can occur during cell state transitions, and establish physiological roles for local chromatin fiber folding in regulating transcription and chromatin domain formation.


Subjects
Chromatin Assembly and Disassembly , Chromatin/genetics , Saccharomyces cerevisiae/genetics , Adenosine Triphosphatases , Chromatin/metabolism , DNA-Binding Proteins , Histones/chemistry , Histones/metabolism , Multiprotein Complexes , Nucleosomes/metabolism , Protein Folding , Saccharomyces cerevisiae/growth & development , Genetic Transcription
17.
Genome Biol ; 22(1): 279, 2021 Sep 27.
Article in English | MEDLINE | ID: mdl-34579774

ABSTRACT

BACKGROUND: Mammalian development is associated with extensive changes in gene expression, chromatin accessibility, and nuclear structure. Here, we follow such changes associated with mouse embryonic stem cell differentiation and X inactivation by integrating, for the first time, allele-specific data from these three modalities obtained by high-throughput single-cell RNA-seq, ATAC-seq, and Hi-C. RESULTS: Allele-specific contact decay profiles obtained by single-cell Hi-C clearly show that the inactive X chromosome has a unique profile in differentiated cells that have undergone X inactivation. Loss of this inactive X-specific structure at mitosis is followed by its reappearance during the cell cycle, suggesting a "bookmark" mechanism. Differentiation of embryonic stem cells to follow the onset of X inactivation is associated with changes in contact decay profiles that occur in parallel on both the X chromosomes and autosomes. Single-cell RNA-seq and ATAC-seq show evidence of a delay in female versus male cells, due to the presence of two active X chromosomes at early stages of differentiation. The onset of the inactive X-specific structure in single cells occurs later than gene silencing, consistent with the idea that chromatin compaction is a late event of X inactivation. Single-cell Hi-C highlights evidence of discrete changes in nuclear structure characterized by the acquisition of very long-range contacts throughout the nucleus. Novel computational approaches allow for the effective alignment of single-cell gene expression, chromatin accessibility, and 3D chromosome structure. CONCLUSIONS: Based on trajectory analyses, three distinct nuclear structure states are detected reflecting discrete and profound simultaneous changes not only to the structure of the X chromosomes, but also to that of autosomes during differentiation. Our study reveals that long-range structural changes to chromosomes appear as discrete events, unlike progressive changes in gene expression and chromatin accessibility.


Subjects
Cell Differentiation/genetics , Gene Expression , Mouse Embryonic Stem Cells/metabolism , X Chromosome Inactivation , Alleles , Animals , Cell Cycle , Cell Line , Cell Nucleus/genetics , Female , Genome , Male , Mice , RNA-Seq , Single-Cell Analysis , X Chromosome/chemistry
18.
J Proteome Res ; 20(9): 4621-4624, 2021 Sep 03.
Article in English | MEDLINE | ID: mdl-34342226

ABSTRACT

The volume of proteomics and mass spectrometry data available in public repositories continues to grow at a rapid pace as more researchers embrace open science practices. Open access to the data behind scientific discoveries has become critical to validate published findings and develop new computational tools. Here, we present ppx, a Python package that provides easy, programmatic access to the data stored in ProteomeXchange repositories, such as PRIDE and MassIVE. The ppx package can be used as either a command line tool or a Python package to retrieve the files and metadata associated with a project when provided its identifier. To demonstrate how ppx enhances reproducible research, we used ppx within a Snakemake workflow to reanalyze a published data set with the open modification search tool ANN-SoLo and compared our reanalysis to the original results. We show that ppx readily integrates into workflows, and our reanalysis produced results consistent with the original analysis. We envision that ppx will be a valuable tool for creating reproducible analyses, providing tool developers easy access to data for development, testing, and benchmarking, and enabling the use of mass spectrometry data in data-intensive analyses. The ppx package is freely available and open source under the MIT license at https://github.com/wfondrie/ppx.


Subjects
Proteomics , Software , Mass Spectrometry , Metadata , Search Engine
19.
J Proteome Res ; 20(8): 4153-4164, 2021 Aug 06.
Article in English | MEDLINE | ID: mdl-34236864

ABSTRACT

The standard proteomics database search strategy involves searching spectra against a peptide database and estimating the false discovery rate (FDR) of the resulting set of peptide-spectrum matches. One assumption of this protocol is that all the peptides in the database are relevant to the hypothesis being investigated. However, in settings where researchers are interested in a subset of peptides, alternative search and FDR control strategies are needed. Recently, two methods were proposed to address this problem: subset-search and all-sub. We show that both methods fail to control the FDR. For subset-search, this failure is due to the presence of "neighbor" peptides, which are defined as irrelevant peptides with a similar precursor mass and fragmentation spectrum as a relevant peptide. Not considering neighbors compromises the FDR estimate because a spectrum generated by an irrelevant peptide can incorrectly match well to a relevant peptide. Therefore, we have developed a new method, "subset-neighbor search" (SNS), that accounts for neighbor peptides. We show evidence that SNS controls the FDR when neighbors are present and that SNS outperforms group-FDR, the only other method that appears to control the FDR relative to a subset of relevant peptides.


Subjects
Algorithms , Tandem Mass Spectrometry , Protein Databases , Humans , Peptides , Proteomics
20.
Environ Microbiol ; 23(7): 3840-3866, 2021 Jul.
Article in English | MEDLINE | ID: mdl-33760340

ABSTRACT

Colwellia psychrerythraea is a marine psychrophilic bacterium known for its remarkable ability to maintain activity during long-term exposure to extreme subzero temperatures and correspondingly high salinities in sea ice. These microorganisms must have adaptations to both high salinity and low temperature to survive, be metabolically active, or grow in the ice. Here, we report on an experimental design that allowed us to monitor culturability, cell abundance, activity and proteomic signatures of C. psychrerythraea strain 34H (Cp34H) in subzero brines and supercooled sea water through long-term incubations under eight conditions with varying subzero temperatures, salinities and nutrient additions. Shotgun proteomics found novel metabolic strategies used to maintain culturability in response to each independent experimental variable, particularly in pathways regulating carbon, nitrogen and fatty acid metabolism. Statistical analysis of abundances of proteins uniquely identified in isolated conditions provides metabolism-specific protein biosignatures indicative of growth or survival in either increased salinity, decreased temperature, or nutrient limitation. Additionally, to aid in the search for extant life on other icy worlds, analysis of detected short peptides in -10°C incubations after 4 months identified over 500 potential biosignatures that could indicate the presence of terrestrial-like cold-active or halophilic metabolisms on other icy worlds.


Subjects
Alteromonadaceae , Proteomics , Alteromonadaceae/genetics , Biomarkers , Cold Temperature